Smart Paper Notes

Novel Class Discovery without Forgetting

K J Joseph , Sujoy Paul , Gaurav Aggarwal , Soma Biswas , Piyush Rai , Kai Han , Vineeth N Balasubramanian

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-21

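Humans have an innate ability to identify and differentiate instances they are unfamiliar with by leveraging and adapting the knowledge they have acquired so far. Importantly, they achieve this without degrading performance on what they learned earlier. Inspired by this, we identify and formulate a new, pragmatic problem setting, NCDwF: Novel Class Discovery without Forgetting, which tasks a machine learning model with incrementally discovering novel categories of instances from unlabeled data while maintaining its performance on previously seen categories. We propose 1) a method to generate pseudo-latent representations that act as a proxy for the (no longer available) labeled data, thereby alleviating forgetting, 2) a mutual-information-based regularizer that enhances unsupervised discovery of novel classes, and 3) a simple Known Class Identifier that aids generalized inference when the test data contains instances from both seen and unseen categories. We introduce experimental protocols based on CIFAR-10, CIFAR-100 and ImageNet-1000 to measure the trade-off between knowledge retention and novel class discovery. Our extensive evaluations reveal that existing models catastrophically forget previously seen categories while identifying novel ones, whereas our method is able to balance effectively between the competing objectives. We hope our work will attract further research into this newly identified pragmatic problem setting.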

Translated by Google Translate
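
The abstract above mentions a mutual-information-based regularizer but does not give its form. As a rough illustration only, the sketch below uses one common estimate of the mutual information between inputs and predicted classes: the entropy of the average prediction minus the average entropy of each prediction. The function names and this particular estimate are assumptions for illustration, not the authors' implementation.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mutual_information(probs):
    """Estimate I(input; predicted class) from per-sample class probabilities.

    probs: list of probability vectors, one per unlabeled sample.
    Assumed form: H(mean prediction) - mean(H(prediction)).
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution over the batch.
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    # Average per-sample (conditional) entropy.
    conditional = sum(entropy(p) for p in probs) / n
    return entropy(marginal) - conditional
```

Under this estimate the value is high when each sample is assigned confidently to one class while the classes are used evenly across the batch, and zero when every prediction is identical.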

Test-time adaptation with slot-centric models

Mihir Prabhudesai , Anirudh Goyal , Sujoy Paul , Sjoerd van Steenkiste , Mehdi S. M. Sajjadi , Gaurav Aggarwal , Thomas Kipf , Deepak Pathak , Katerina Fragkiadaki

Categories: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2022-03-21

Current supervised visual detectors, though impressive within their training distribution, often fail to segment out-of-distribution scenes into their constituent entities. Recent test-time adaptation methods use auxiliary self-supervised losses to adapt the network parameters to each test example independently and have shown promising results towards generalization outside the training distribution for the task of image classification. In our work, we find evidence that these losses can be insufficient for instance segmentation tasks, without also considering architectural inductive biases. For image segmentation, recent slot-centric generative models break such dependence on supervision by attempting to segment scenes into entities in a self-supervised manner by reconstructing pixels. Drawing upon these two lines of work, we propose Slot-TTA, a semi-supervised instance segmentation model equipped with a slot-centric inductive bias, that is adapted per scene at test time through gradient descent on reconstruction or novel view synthesis objectives. We show that test-time adaptation in Slot-TTA greatly improves instance segmentation in out-of-distribution scenes. We evaluate Slot-TTA in several 3D and 2D scene instance segmentation benchmarks and show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors and self-supervised test-time adaptation methods.

Unsupervised Adaptation of Semantic Segmentation Models without Source Data

Sujoy Paul , Ansh Khurana , Gaurav Aggarwal

Categories: Computer Vision

2021-12-04

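We consider the novel problem of unsupervised domain adaptation of source models for semantic segmentation without access to the source data. Unsupervised domain adaptation aims to adapt a model trained on labeled source data to a new, unlabeled target dataset. Existing methods assume that the source data is available together with the target data during adaptation. In practical scenarios, however, we may have access only to the source model and the unlabeled target data, and not to the labeled source data. In this work, we propose a self-training approach to extract knowledge from the source model. To compensate for the distribution shift from source to target, we first update the normalization parameters of the network using the unlabeled target data. We then employ confidence-filtered pseudo-labeling and enforce consistency under certain transformations. Despite being very simple and intuitive, our framework achieves significant performance gains over directly applying the source model to the target data, as reflected in our extensive experiments and ablation studies. In fact, the performance is only a few points away from recent state-of-the-art methods that use source data for adaptation. We further demonstrate the generality of the proposed approach in the fully test-time adaptation setting, where we do not need any target training data and adapt only at test time.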
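
Confidence-filtered pseudo-labeling, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions: the threshold value, the `IGNORE` marker, and the function name are hypothetical and not taken from the paper.

```python
IGNORE = -1  # hypothetical marker for pixels excluded from the self-training loss


def confident_pseudo_labels(probs, threshold=0.9):
    """Turn per-pixel class probabilities into pseudo-labels.

    probs: list of probability vectors, one per pixel.
    A pixel keeps its most likely class only if the model's confidence
    reaches `threshold` (an assumed value); otherwise it is ignored.
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)
        labels.append(best if p[best] >= threshold else IGNORE)
    return labels
```

For example, `confident_pseudo_labels([[0.95, 0.05], [0.6, 0.4]])` keeps class 0 for the first pixel and marks the second as ignored.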

SITA: Single Image Test-time Adaptation

Ansh Khurana , Sujoy Paul , Piyush Rai , Soma Biswas , Gaurav Aggarwal

Categories: Computer Vision

2021-12-04

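In test-time adaptation (TTA), given a model trained on some source data, the goal is to adapt it to make better predictions for test instances from a different distribution. Crucially, TTA assumes no access to the source data, or even to any additional labeled or unlabeled samples from the target distribution, for fine-tuning the source model. In this work, we consider TTA in a more pragmatic setting that we refer to as SITA (Single Image Test-time Adaptation). Here, when making each prediction, the model has access only to the given single test instance, rather than a batch of instances as is typically assumed in the literature. This is motivated by realistic scenarios in which inference is needed on demand and cannot be delayed to "batch-ify" incoming requests, or in which inference happens on an edge device (such as a mobile phone) where there is no scope for batching. The entire adaptation process in SITA should be extremely fast, since it happens at inference time. To address this, we propose a novel approach, AugBN, for the SITA setting that requires only forward propagation. The approach can adapt any off-the-shelf trained model to individual test instances for both classification and segmentation tasks. AugBN estimates the normalization statistics of the unseen test distribution from the given test image using only one forward pass with label-preserving transformations. Since AugBN does not involve any back-propagation, it is significantly faster than other recent methods. To the best of our knowledge, this is the first work to address this hard adaptation problem using only a single test image. Despite being very simple, our framework achieves significant performance gains over directly applying the source model to the target instances, as reflected in our extensive experiments and ablation studies.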
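
The idea of estimating normalization statistics from label-preserving transformations of a single test image can be illustrated with the toy sketch below. It works on a single-channel image given as nested lists. The choice of augmentations, the blending `weight` with the source statistics, and all function names are assumptions for illustration, not the AugBN procedure from the paper.

```python
from statistics import fmean, pvariance


def hflip(img):
    """Horizontal flip of an image given as a list of pixel rows."""
    return [row[::-1] for row in img]


def scale(img, factor):
    """Scale pixel intensities by a constant factor."""
    return [[p * factor for p in row] for row in img]


def augmented_stats(img, source_mean, source_var, weight=0.5):
    """Estimate normalization statistics from augmented views of one image.

    The test-time estimate is blended with the stored source statistics;
    `weight` (assumed) is the share given to the source statistics.
    """
    views = [img, hflip(img), scale(img, 0.9), scale(img, 1.1)]
    pixels = [p for view in views for row in view for p in row]
    test_mean = fmean(pixels)
    test_var = pvariance(pixels, mu=test_mean)
    mean = weight * source_mean + (1 - weight) * test_mean
    var = weight * source_var + (1 - weight) * test_var
    return mean, var


def normalize(img, mean, var, eps=1e-5):
    """Apply the estimated statistics; no back-propagation is involved."""
    std = (var + eps) ** 0.5
    return [[(p - mean) / std for p in row] for row in img]
```

Only forward computation is used here, which mirrors the abstract's point that adaptation without back-propagation is cheap enough to run per request.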

Learning Few-shot Open-set Classifiers using Exemplar Reconstruction

Sayak Nag , Dripta S. Raychaudhuri , Sujoy Paul , Amit K. Roy-Chowdhury

Categories: Computer Vision

2021-07-31

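We study the problem of identifying samples from unseen categories (open-set classification) when only a few samples are given from the seen categories (the few-shot setting). The challenge of learning a good abstraction for a class from very few samples makes it extremely difficult to detect samples from unseen categories; consequently, open-set recognition has received minimal attention in the few-shot setting. Most open-set few-shot classification methods regularize the softmax score to indicate a uniform probability for open-class samples, but we argue that this approach is often inaccurate, especially at a fine-grained level. Instead, we propose a novel exemplar-reconstruction-based meta-learning strategy for jointly detecting open-class samples and categorizing samples from seen classes via metric-based classification. The exemplars, which act as representatives of a class, can either be provided in the training dataset or be estimated in the feature domain. Our framework, named Reconstructing Exemplar-based Few-shot Open-set ClaSsifier (ReFOCS), is tested on a wide variety of datasets, and the experimental results clearly highlight our method as the new state of the art.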

Mapping Knowledge Representations to Concepts: A Review and New Perspectives

Lars Holmberg , Paul Davidsson , Per Linde

Categories: Artificial Intelligence | Machine Learning

2022-12-31

The success of neural networks builds to a large extent on their ability to create internal knowledge representations from real-world high-dimensional data, such as images, sound, or text. Extracting and presenting these representations in order to explain a neural network's decisions is an active and multifaceted research field. To gain a deeper understanding of a central aspect of this field, we have performed a targeted review focusing on research that aims to associate internal representations with human-understandable concepts. In doing this, we add a perspective on the existing research by using primarily deductive nomological explanations as a proposed taxonomy. We find this taxonomy, together with theories of causality, useful for understanding what can and cannot be expected from neural network explanations. The analysis additionally uncovers an ambiguity in the reviewed literature regarding the goal of model explainability: is it to understand the ML model, or to provide actionable explanations that are useful in the deployment domain?

On Implicit Bias in Overparameterized Bilevel Optimization

Paul Vicol , Jonathan Lorraine , Fabian Pedregosa , David Duvenaud , Roger Grosse

Categories: Machine Learning

2022-12-28

Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
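
To illustrate why the choice among equivalent optima matters, the toy sketch below runs gradient descent on an overparameterized inner problem with two parameters and one constraint. The inner loss `0.5 * (w1 + w2 - lam) ** 2`, the step size, and the function name are assumptions chosen for illustration and are not the paper's experiments. Every point with `w1 + w2 == lam` is an inner optimum, and the one that gradient descent reaches depends on where it starts. This is relevant because cold-start methods re-initialize the inner parameters at each outer step, whereas warm-start methods continue from the current inner parameters.

```python
def inner_gd(w, lam, lr=0.25, steps=200):
    """Gradient descent on the toy inner loss 0.5 * (w1 + w2 - lam) ** 2.

    w: initial inner parameters (w1, w2); lam: outer parameter.
    Returns the inner parameters after `steps` updates.
    """
    w1, w2 = w
    for _ in range(steps):
        residual = w1 + w2 - lam
        # The gradient is the same for both coordinates.
        w1 -= lr * residual
        w2 -= lr * residual
    return (w1, w2)


# Same outer parameter, different initializations, different inner optima.
from_zero = inner_gd((0.0, 0.0), lam=2.0)   # converges towards (1.0, 1.0)
from_other = inner_gd((1.0, 0.0), lam=2.0)  # converges towards (1.5, 0.5)
```

Both results attain zero inner loss, yet an outer objective that depends on `w1` alone would score them differently.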

Brain Cancer Segmentation Using YOLOv5 Deep Neural Network

Sudipto Paul , Dr. Md Taimur Ahad , Md. Mahedi Hasan

Categories: Computer Vision

2022-12-27

An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. A brain tumor can develop in any portion of the brain or skull, including the brain's protective coating, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten years, numerous developments have been made in the field of computer-aided brain tumor diagnosis. Recently, instance segmentation has attracted a lot of interest in numerous computer vision applications. It seeks to assign distinct IDs to the objects in a scene, even if they are members of the same class. Typically, a two-stage pipeline is used to perform instance segmentation. This study demonstrates brain cancer segmentation using YOLOv5. YOLO takes a dataset as images with corresponding text files. You Only Look Once (YOLO) is a popular and widely used algorithm that is well known for its ability to identify objects. YOLO V2, V3, V4, and V5 are among the versions of YOLO published in recent years. Early brain tumor detection is one of the most important jobs that neurologists and radiologists have. However, it can be difficult and error-prone to manually identify and segment brain tumors from Magnetic Resonance Imaging (MRI) data. An automated brain tumor detection system is therefore necessary for making an early diagnosis of the condition. The model in this paper has three classes: Meningioma, Pituitary, and Glioma. The results show that our model achieves competitive accuracy when run on an M2 10-core GPU.

Large Language Models Encode Clinical Knowledge

Karan Singhal , Shekoofeh Azizi , Tao Tu , S. Sara Mahdavi , Jason Wei , Hyung Won Chung , Nathan Scales , Ajay Tanwani , Heather Cole-Lewis , Stephen Pfohl

Categories: Natural Language Processing

2022-12-26

Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.

Generalizable Natural Language Processing Framework for Migraine Reporting from Social Media

Yuting Guo , Swati Rajwal , Sahithi Lakamana , Chia-Chun Chiang , Paul C. Menell , Adnan H. Shahid , Yi-Chieh Chen , Nikita Chhabra , Wan-Ju Chao , Chieh-Ju Chao

Categories: Natural Language Processing

2022-12-23

Migraine is a high-prevalence and disabling neurological disorder. However, information on migraine management in real-world settings could be limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter available on social media (Twitter and Reddit), self-reported by migraine sufferers; (ii) develop a platform-independent text classification system for automatically detecting self-reported migraine-related posts; and (iii) conduct analyses of the self-reported posts to assess the utility of social media for studying this problem. We manually annotated 5750 Twitter posts and 302 Reddit posts. Our system achieved an F1 score of 0.90 on Twitter and 0.93 on Reddit. Analysis of information posted by our 'migraine cohort' revealed the presence of a plethora of relevant information about migraine therapies and patient sentiments associated with them. Our study forms the foundation for conducting an in-depth analysis of migraine-related information using social media data.
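
For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name and the example counts are hypothetical and are not taken from the paper's data.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2 * precision * recall / (precision + recall)."""
    if true_pos == 0:
        return 0.0  # no correct positives, so precision and recall are zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 9 true positives, 1 false positive, and 1 false negative, precision and recall are both 0.9, and so is F1.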
